Claims




1. A method of setting up one or more nucleic acid sequences encoding one or
more (poly)peptide sequences suitable for the creation of libraries of
(poly)peptides, said (poly)peptide sequences comprising amino acid
consensus sequences, said method comprising the following steps:

(a) deducing from a collection of at least three homologous proteins one
or more (poly)peptide sequences comprising at least one amino acid
consensus sequence;

(b) optionally, identifying amino acids in said (poly)peptide sequences to
be modified so as to remove unfavorable interactions between amino
acids within or between said or other (poly)peptide sequences;

(c) identifying at least one structural sub-element within each of said
(poly)peptide sequences;

(d) backtranslating each of said (poly)peptide sequences into a
corresponding coding nucleic acid sequence;

(e) setting up cleavage sites in regions adjacent to or between the ends of
sub-sequences encoding said sub-elements, each of said cleavage
sites:

(ea) being unique within each of said coding nucleic acid sequences;

(eb) being common to the corresponding sub-sequences of any said
coding nucleic acids.
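Steps (a) and (d) of claim 1 can be illustrated with a minimal sketch. It assumes the homologous sequences are already aligned to equal length, takes the most frequent residue in each column as the consensus, and backtranslates with one fixed codon per amino acid. The function names and the codon table are hypothetical choices made for illustration and are not taken from the claims.

```python
from collections import Counter

# Hypothetical codon choice (one codon per amino acid). A real design would
# use a codon usage table for the intended expression host.
PREFERRED_CODON = {
    "A": "GCG", "C": "TGC", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGC", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "CGT",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAT",
}


def consensus(aligned):
    """Return the most frequent residue in each column of an alignment."""
    if len(aligned) < 3:
        raise ValueError("need at least three homologous sequences")
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be pre-aligned to equal length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def backtranslate(peptide):
    """Backtranslate a peptide into DNA using the fixed codon table above."""
    return "".join(PREFERRED_CODON[aa] for aa in peptide)


if __name__ == "__main__":
    family = ["MKVL", "MKIL", "MRVL"]  # toy alignment, not real data
    cons = consensus(family)
    print(cons, backtranslate(cons))
```

Ties between equally frequent residues are not resolved in any principled way here; a fuller treatment would need an explicit rule for such columns.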




2. A method of setting up two or more sets of one or more nucleic acid
sequences comprising executing the steps described in claim 1 for each of
said sets with the additional provision that said cleavage sites are unique
between said sets.



3. The method of claim 2 in which at least two of said sets are deduced from the
same collection of at least three homologous proteins.

4. The method according to any one of claims 1 to 3, wherein said setting up
further comprises the synthesis of said nucleic acid coding sequences.



5. The method according to any one of claims 1 to 4, further comprising the
cloning of said nucleic acid coding sequences into a vector.






WO 97/08320 PCT/EP96/03647 

6. The method according to any one of claims 1 to 5, wherein said removal of 
unfavorable interactions results in enhanced expression of said 
(poly)peptides. 

7. The method according to any one of claims 1 to 6, further comprising the
steps of:

(f) cleaving at least two of said cleavage sites located in regions adjacent
to or between the ends of said sub-sequences; and

(g) exchanging said sub-sequences by different sequences; and 

(h) optionally, repeating steps (f) and (g) one or more times. 
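The cleave-and-exchange cycle of steps (f) and (g) can be pictured at the string level as follows. This is only a sketch: it treats the two flanking cleavage sites as plain recognition sequences that occur exactly once, keeps both recognition sequences intact, and ignores actual cut positions and overhangs. The function name `exchange_cassette` and the example sequences are hypothetical.

```python
def exchange_cassette(gene, site5, site3, new_insert):
    """Replace the segment between two unique flanking sites with new_insert."""
    if site5 == site3:
        raise ValueError("flanking sites must differ")
    if gene.count(site5) != 1 or gene.count(site3) != 1:
        raise ValueError("each flanking site must occur exactly once")
    start = gene.index(site5) + len(site5)  # first base after the 5' site
    end = gene.index(site3)                 # first base of the 3' site
    if start > end:
        raise ValueError("5' site must lie upstream of 3' site")
    return gene[:start] + new_insert + gene[end:]


if __name__ == "__main__":
    # Toy gene: ATG, EcoRI site, old cassette AAACCC, BamHI site, TAA
    gene = "ATGGAATTCAAACCCGGATCCTAA"
    print(exchange_cassette(gene, "GAATTC", "GGATCC", "GGGTTT"))
```

Because the result keeps the same two unique sites, the function can be applied again to its own output, which mirrors the optional repetition in step (h).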

8. The method according to claim 7, wherein said different sequences are
selected from the group of different sub-sequences encoding the same or
different sub-elements derived from the same or different (poly)peptides.



9. The method according to claims 7 or 8, wherein said different sequences are
selected from the group of:

(i) genomic sequences or sequences derived from genomic sequences;

(ii) rearranged genomic sequences or sequences derived from
rearranged genomic sequences; and

(iii) random sequences.

10. The method according to any one of claims 1 to 9 further comprising the
expression of said nucleic acid coding sequences.

11. The method according to any one of claims 1 to 10 further comprising the
steps of:

(i) screening, after expression, the resultant (poly)peptides for a desired
property;

(k) optionally, repeating steps (f) to (i) one or more times with nucleic acid
sequences encoding one or more (poly)peptides obtained in step (i).

12. The method according to claim 11, wherein said desired property is selected
from the group of optimized affinity or specificity for a target molecule,
optimized enzymatic activity, optimized expression yields, optimized stability
and optimized solubility.
13. The method according to any one of claims 1 to 12, wherein said cleavage
sites are sites cleaved by restriction enzymes.
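When the cleavage sites are restriction sites, the uniqueness condition of claim 1(ea) can be checked mechanically. The sketch below counts occurrences of a few well-known palindromic recognition sequences (EcoRI, BamHI, HindIII) and reports the enzymes whose site occurs exactly once in every gene of a set. It is a simplification: it does not check that the site sits at the corresponding sub-sequence boundary in each gene, as claim 1(eb) requires. The function names and test sequences are hypothetical.

```python
# Recognition sequences of three common restriction enzymes. All three are
# palindromic, so scanning one strand is sufficient.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def count_site(seq, site):
    """Count occurrences of site in seq, including overlapping ones."""
    n = len(site)
    return sum(1 for i in range(len(seq) - n + 1) if seq[i:i + n] == site)


def usable_sites(genes, sites=SITES):
    """Return enzymes whose site occurs exactly once in every gene."""
    return sorted(
        name for name, site in sites.items()
        if all(count_site(g, site) == 1 for g in genes)
    )


if __name__ == "__main__":
    genes = [
        "ATGGAATTCAAAGGATCCTTT",     # one EcoRI site, one BamHI site
        "ATGGAATTCCCCGGATCCGGATCC",  # one EcoRI site, two BamHI sites
    ]
    print(usable_sites(genes))
```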

14. The method according to any one of claims 1 to 13, wherein said structural
sub-elements comprise between 1 and 150 amino acids.

15. The method according to claim 14, wherein said structural sub-elements
comprise between [illegible] and 25 amino acids.

16. The method according to any one of claims 1 to 15, wherein said nucleic acid
is DNA.

17. The method according to any one of claims 1 to 16, wherein said
(poly)peptides have an amino acid pattern characteristic of a particular
species.

18. The method according to claim 17, wherein said species is human.

19. The method according to any one of claims 1 to 18, wherein said
(poly)peptides are at least part of members or derivatives of the
immunoglobulin superfamily.



20. The method according to claim 19, wherein said members or derivatives of
the immunoglobulin superfamily are members or derivatives of the
immunoglobulin family.

21. The method according to claim 19 or 20, wherein said (poly)peptides are or
are derived from heavy or light chain variable regions wherein said structural
sub-elements are framework regions (FR) 1, 2, 3, or 4 or complementarity
determining regions (CDR) 1, 2, or 3.

22. The method according to claim 20 or 21, wherein said (poly)peptides are or
are derived from the HuCAL consensus genes:

Vκ1, Vκ2, Vκ3, Vκ4, Vλ1, Vλ2, Vλ3, VH1A, VH1B, VH2, VH3, VH4, VH5, VH6,
Cκ, Cλ, CH1 or any combination of said HuCAL consensus genes.

23. The method according to any one of claims 20 to 22, wherein said derivative
of said immunoglobulin family or said combination is an Fv, disulphide-linked
Fv, single-chain Fv (scFv), or Fab fragment.
24. The method according to claims 22 to 23, wherein said derivative is an scFv
fragment comprising the combination of HuCAL VH3 and HuCAL Vλ2
consensus genes that comprises a random sub-sequence encoding the
heavy chain CDR3 sub-element.

25. The method according to any one of claims 1 to 24, wherein at least part of
said (poly)peptide sequences or (poly)peptides is connected to a sequence
encoding at least one additional moiety or to at least one additional moiety,
respectively.

26. The method according to claim 25, wherein said connection is formed via a
contiguous nucleic acid sequence or amino acid sequence, respectively.

27. The method according to claims 25 to 26, wherein said additional moiety is a
toxin, a cytokine, a reporter enzyme, a moiety being capable of binding a
metal ion, a peptide, a tag suitable for detection and/or purification, or a
homo- or hetero-association domain.

28. The method according to any one of claims 10 to 27, wherein the expression
of said nucleic acid sequences results in the generation of a repertoire of
biological activities and/or specificities, preferably in the generation of a
repertoire based on a universal framework.

29. A nucleic acid sequence obtainable by the method according to any of claims
1 to 28.

30. A collection of nucleic acid sequences obtainable by the method according to
any of claims 1 to 28.

31. A recombinant vector obtainable by the method according to any of claims 5
to 28.

32. A collection of recombinant vectors obtainable by the method according to
any of claims 5 to 30.

33. A host cell transformed with the recombinant vector according to claim 31.
34. A collection of host cells transformed with the collection of recombinant
vectors according to claim 32.

35. A method of producing a (poly)peptide or a collection of (poly)peptides as
defined in any of claims 1 to 28 comprising culturing the host cell according to
claim 33 or the collection of host cells according to claim 34 under suitable
conditions and isolating said (poly)peptide or said collection of
(poly)peptides.



36. A (poly)peptide devisable by the method according to any one of claims 1 to
3, encoded by the nucleic acid sequence according to claim 29 or obtainable
by the method according to any one of claims 4 to 28 or 35.



37. A collection of (poly)peptides devisable by the method according to any one
of claims 1 to 3, encoded by the collection of nucleic acid sequences
according to claim 30 or obtainable by the method according to any one of
claims 4 to 28 or 35.

38. A vector suitable for use in the method according to any of claims 1 to 28 and
35 characterized in that said vector is essentially devoid of any cleavage site
as defined in claim 1(e) and 2.



39. The vector according to claim 38 which is an expression vector. 



40. A kit comprising at least one of:

(a) a nucleic acid sequence according to claim 29;

(b) a collection of nucleic acid sequences according to claim 30;

(c) a recombinant vector according to claim 31;

(d) a collection of recombinant vectors according to claim 32;

(e) a (poly)peptide according to claim 36;

(f) a collection of (poly)peptides according to claim 37;

(g) a vector according to claim 38 or 39; and optionally,

(h) a suitable host cell for carrying out the method according to claim 35.



41. A method of designing two or more genes encoding a collection of two or more
proteins, comprising the steps of:

(aa) identifying two or more homologous gene sequences, or

(ab) analyzing at least three homologous genes, and
deducing two or more consensus gene sequences therefrom,

(b) optionally, modifying codons in said consensus gene sequences
to remove unfavourable interactions between amino acids in the
resulting proteins,

(c) identifying sub-sequences which encode structural sub-elements
in said consensus gene sequences,

(d) modifying one or more bases in regions adjacent to or between
the ends of said sub-sequences to define one or more cleavage
sites, each of which

(da) are unique within each consensus gene sequence,

(db) do not form compatible sites with respect to any single
sub-sequence,

(dc) are common to all homologous sub-sequences.
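Condition (db) can be read as requiring that the two sites flanking any one sub-sequence do not generate compatible cohesive ends, so that a replacement cassette can only be inserted in one orientation. Under that reading, which is an interpretation and not stated in the claim, a minimal check compares the single-stranded overhangs left by the two flanking enzymes. The enzyme table lists well-known 5' overhangs; the function names and the example flank assignments are hypothetical.

```python
# 5' overhangs produced by some common restriction enzymes. Each overhang
# listed is palindromic, so two ends are compatible when the overhangs match.
OVERHANG = {
    "EcoRI": "AATT",
    "BamHI": "GATC",
    "BglII": "GATC",
    "HindIII": "AGCT",
    "XhoI": "TCGA",
    "SalI": "TCGA",
}


def compatible(enzyme_a, enzyme_b):
    """True if the two enzymes leave ends that can ligate to each other."""
    return OVERHANG[enzyme_a] == OVERHANG[enzyme_b]


def flanks_ok(flanks):
    """Map each sub-element to True if its two flanking sites are incompatible.

    flanks: dict of sub-element name -> (5' enzyme, 3' enzyme)
    """
    return {name: not compatible(e5, e3) for name, (e5, e3) in flanks.items()}


if __name__ == "__main__":
    design = {
        "CDR1": ("BamHI", "BglII"),    # both leave GATC ends: not acceptable
        "CDR2": ("EcoRI", "HindIII"),  # AATT vs AGCT: acceptable
    }
    print(flanks_ok(design))
```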



42. A method of preparing two or more genes encoding a collection of two or more
proteins, comprising the steps of:

(a) designing said genes according to claim 41, and

(b) synthesizing said genes.

43. A collection of genes prepared according to the method of claim 42.

44. A collection of two or more genes derived from gene sequences which:

(a) are either homologous, or represent consensus gene sequences
derived from at least three homologous genes, and

(b) carry cleavage sites, each of which:

(ba) lie at or adjacent to the ends of genetic sub-sequences which
encode structural sub-elements,

(bb) are unique within each gene sequence,

(bc) do not form compatible sites with respect to any single
sub-sequence, and

(bd) are common to all homologous sub-sequences.



45. The collection of genes according to either of claims 43 or 44 in which each of
said gene sequences has a nucleotide composition characteristic of a
particular species.



46. The collection of genes according to claim 45 in which said species is human.

47. The collection of genes according to any of claims 43 to 46 in which one or
more of said gene sequences encodes at least part of a member of the
immunoglobulin superfamily, preferably of the immunoglobulin family.



48. The collection of genes according to claim 47 in which said structural
sub-elements correspond to any combination of framework regions 1, 2, 3, and 4,
and/or CDR regions 1, 2, and 3 of antibody heavy chains.



49. The collection of genes according to claim 47 in which said structural
sub-elements correspond to any combination of framework regions 1, 2, 3, and 4,
and/or CDR regions 1, 2, and 3 of antibody light chains.



50. A collection of vectors comprising a collection of gene sequences according
to any of claims 43 to 49.
51. The collection of vectors according to claim 50 comprising the additional
feature that the vector does not comprise any cleavage site that is contained
in the collection of genes according to any of claims 43 to 49.



52. A method for identifying one or more genes encoding one or more proteins 
having a desirable property, comprising the steps of: 

(a) expressing from the collection of vectors according to either of claims
50 or 51 a collection of proteins,

(b) screening said collection to isolate one or more proteins having a
desired property,

(c) identifying the genes encoding the proteins isolated in step (b),

(d) optionally, excising from the genes encoding the proteins isolated in
step (b) one or more genetic sub-sequences encoding structural
sub-elements, and replacing said sub-sequence(s) by one or more second
sub-sequences encoding structural sub-elements, to generate new
vectors according to either of claims 50 or 51,

(e) optionally, repeating steps (a) to (c).



53. A method for identifying one or more genes encoding one or more antibody
fragments which bind to a target, comprising the steps of:

(a) expressing from the collection of vectors according to either of claims
50 or 51 a collection of proteins,

(b) screening said collection to isolate one or more antibody fragments
which bind to said target,

(c) identifying the genes encoding the proteins isolated in step (b),

(d) optionally, excising from the genes encoding the antibody fragments
isolated in step (b) one or more genetic sub-sequences encoding
structural sub-elements, and replacing said sub-sequence(s) by one or
more second sub-sequences encoding structural sub-elements, to generate new
vectors according to either of claims 50 or 51,

(e) optionally, repeating steps (a) to (c).



54. A kit comprising two or more genes derived from gene sequences which:

(a) are either homologous, or represent consensus gene sequences
derived from at least three homologous genes, and

(b) carry cleavage sites, each of which:

(ba) lie at or adjacent to the ends of genetic sub-sequences which
encode structural sub-elements,

(bb) are unique within each gene sequence,

(bc) do not form compatible sites with respect to any single
sub-sequence, and

(bd) are common to all homologous sub-sequences.




55. A kit comprising two or more genetic sub-sequences which encode structural
sub-elements, which can be assembled to form genes, and which carry
cleavage sites, each of which:

(a) lie at or adjacent to the ends of said genetic sub-sequences,

(b) do not form compatible sites with respect to any single sub-sequence,
and

(c) are common to all homologous sub-sequences.